Multidimensional data-driven classification of emission-line galaxies
Abstract
We propose a new soft clustering scheme for classifying galaxies into different activity classes using four emission-line ratios simultaneously: log ([N II]/H α), log ([S II]/H α), log ([O I]/H α), and log ([O III]/H β). We fit 20 multivariate Gaussian distributions to the four-dimensional distribution of these line ratios, obtained from the Sloan Digital Sky Survey, in order to capture local structures. We then group the Gaussian components so that the groups represent the complex structure of the joint distribution of galaxy spectra in the four-dimensional line-ratio space. The main advantages of this method are the use of all four optical-line ratios simultaneously and the adoption of a clustering scheme. This maximizes the use of the available information, avoids contradictory classifications, and treats each class as a distribution, which results in soft classification boundaries and provides the probability that an object belongs to each class. We also introduce linear multidimensional decision surfaces, obtained with support vector machines trained on the classifications of our soft clustering scheme. This linear multidimensional hard clustering technique shows high classification accuracy with respect to our soft clustering scheme.
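The soft classification step described above can be illustrated with a minimal sketch: once Gaussian components have been fitted and grouped into classes, the probability that an object belongs to a class is the summed posterior weight of that class's components. Everything below is an assumption made for illustration and is not taken from the paper: the function names, the class labels `class_A` to `class_C`, the toy means, variances and weights, the use of 4 components instead of 20, and the restriction to diagonal covariances (which keeps the example dependency-free; a general multivariate Gaussian would use a full covariance matrix).

```python
import math

def gaussian_pdf_diag(x, mean, var):
    """Density of a Gaussian with diagonal covariance, evaluated at point x."""
    log_p = 0.0
    for xi, mi, vi in zip(x, mean, var):
        log_p += -0.5 * (math.log(2.0 * math.pi * vi) + (xi - mi) ** 2 / vi)
    return math.exp(log_p)

def class_probabilities(x, components, groups):
    """Sum the posterior weight of the components assigned to each class."""
    weighted = [w * gaussian_pdf_diag(x, mean, var) for (w, mean, var) in components]
    total = sum(weighted)
    return {label: sum(weighted[k] for k in idx) / total
            for label, idx in groups.items()}

# Hypothetical toy mixture in the 4-D line-ratio space
# (log [N II]/Ha, log [S II]/Ha, log [O I]/Ha, log [O III]/Hb).
# Each entry is (weight, mean, per-axis variance); all values are made up.
VAR = [0.04, 0.04, 0.04, 0.04]
components = [
    (0.40, [-0.6, -0.7, -1.6, -0.3], VAR),
    (0.20, [-0.4, -0.5, -1.4, -0.1], VAR),
    (0.25, [-0.1, -0.3, -0.9, 0.7], VAR),
    (0.15, [0.0, -0.1, -0.7, 0.2], VAR),
]

# Hypothetical grouping of components into classes (indices into `components`).
groups = {"class_A": [0, 1], "class_B": [2], "class_C": [3]}

# Class membership probabilities for one object.
probs = class_probabilities([-0.5, -0.6, -1.5, -0.2], components, groups)
```

The returned dictionary sums to one across classes, so an object lying between two groups receives comparable probabilities for both instead of a single forced label. A hard classifier, such as the linear decision surfaces mentioned in the abstract, could then be trained on the most probable class of each object.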
- Publication: Monthly Notices of the Royal Astronomical Society
- Pub Date: May 2019
- DOI: 10.1093/mnras/stz330
- arXiv: arXiv:1802.01233
- Bibcode: 2019MNRAS.485.1085S
- Keywords: galaxies: active; galaxies: clusters; galaxies: emission lines; Astrophysics - Astrophysics of Galaxies